Overview

Dataset statistics

Number of variables: 14
Number of observations: 10682
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.1 MiB
Average record size in memory: 112.0 B

Variable types

Numeric: 10
Categorical: 4

Warnings

year_of_Journey has constant value "2019" (Constant)
month_of_Journey is highly correlated with week_of_Journey (High correlation)
week_of_Journey is highly correlated with month_of_Journey (High correlation)
Source is highly correlated with year_of_Journey (High correlation)
month_of_Journey is highly correlated with year_of_Journey (High correlation)
year_of_Journey is highly correlated with Source and 2 other fields (High correlation)
Total_Stops is highly correlated with year_of_Journey (High correlation)
df_index is uniformly distributed (Uniform)
df_index has unique values (Unique)
Airline has 319 (3.0%) zeros (Zeros)
Destination has 2871 (26.9%) zeros (Zeros)
minute_of_Journey has 2062 (19.3%) zeros (Zeros)
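
The percentages in the Zeros warnings can be re-derived from the zero counts and the number of observations. The following is a minimal sketch, not the report generator's own code; the helper name `zero_share` is hypothetical, and the counts are copied from the warnings and the dataset statistics.

```python
def zero_share(zero_count, n_rows):
    """Percentage of rows equal to zero, rounded to one decimal place."""
    return round(100 * zero_count / n_rows, 1)


# "Number of observations" from the dataset statistics.
N_OBSERVATIONS = 10682

# Zero counts as listed in the Zeros warnings.
zero_counts = {"Airline": 319, "Destination": 2871, "minute_of_Journey": 2062}

shares = {name: zero_share(count, N_OBSERVATIONS)
          for name, count in zero_counts.items()}

for name, share in shares.items():
    print(f"{name}: {share}% zeros")
```

Airline and Destination appear to be integer-coded categories, so a zero there is probably a category code rather than a missing or empty measurement; that reading is an assumption, since the report does not document the encoding.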

Reproduction

Analysis started: 2021-08-27 05:35:40.787974
Analysis finished: 2021-08-27 05:35:54.717358
Duration: 13.93 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
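
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This sketch uses only the Python standard library, with the timestamps copied from the lines above.

```python
from datetime import datetime

# Timestamps as printed in the Reproduction section.
started = datetime.fromisoformat("2021-08-27 05:35:40.787974")
finished = datetime.fromisoformat("2021-08-27 05:35:54.717358")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```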

Variables

df_index
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 10682
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5340.65381
Minimum: 0
Maximum: 10682
Zeros: 1
Zeros (%): < 0.1%
Memory size: 83.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 534.05
Q1: 2670.25
median: 5340.5
Q3: 8010.75
95-th percentile: 10147.95
Maximum: 10682
Range: 10682
Interquartile range (IQR): 5340.5

Descriptive statistics

Standard deviation: 3083.997576
Coefficient of variation (CV): 0.5774569343
Kurtosis: -1.199877355
Mean: 5340.65381
Median Absolute Deviation (MAD): 2670.5
Skewness: 0.0001753776066
Sum: 57048864
Variance: 9511041.05
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
2047                  1      < 0.1%
1346                  1      < 0.1%
5440                  1      < 0.1%
9534                  1      < 0.1%
3387                  1      < 0.1%
1338                  1      < 0.1%
7481                  1      < 0.1%
5432                  1      < 0.1%
9526                  1      < 0.1%
3379                  1      < 0.1%
Other values (10672)  10672  99.9%

Value                 Count  Frequency (%)
0                     1      < 0.1%
1                     1      < 0.1%
2                     1      < 0.1%
3                     1      < 0.1%
4                     1      < 0.1%

Value                 Count  Frequency (%)
10682                 1      < 0.1%
10681                 1      < 0.1%
10680                 1      < 0.1%
10679                 1      < 0.1%
10678                 1      < 0.1%

Airline
Real number (ℝ≥0)

ZEROS

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.966204831
Minimum: 0
Maximum: 11
Zeros: 319
Zeros (%): 3.0%
Memory size: 83.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
median: 4
Q3: 4
95-th percentile: 8
Maximum: 11
Range: 11
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 2.352090225
Coefficient of variation (CV): 0.5930329688
Kurtosis: 0.3664876888
Mean: 3.966204831
Median Absolute Deviation (MAD): 1
Skewness: 0.7310573319
Sum: 42367
Variance: 5.532328428
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value                 Count  Frequency (%)
4                     3849   36.0%
3                     2053   19.2%
1                     1751   16.4%
6                     1196   11.2%
8                     818    7.7%
10                    479    4.5%
0                     319    3.0%
2                     194    1.8%
7                     13     0.1%
5                     6      0.1%
Other values (2)      4      < 0.1%

Value                 Count  Frequency (%)
0                     319    3.0%
1                     1751   16.4%
2                     194    1.8%
3                     2053   19.2%
4                     3849   36.0%

Value                 Count  Frequency (%)
11                    3      < 0.1%
10                    479    4.5%
9                     1      < 0.1%
8                     818    7.7%
7                     13     0.1%

Source
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 83.6 KiB

Value                 Count
2                     4536
3                     2871
0                     2197
4                     697
1                     381

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 10682
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 3
3rd row: 2
4th row: 3
5th row: 0

Value                 Count  Frequency (%)
2                     4536   42.5%
3                     2871   26.9%
0                     2197   20.6%
4                     697    6.5%
1                     381    3.6%

Histogram of lengths of the category

Value                 Count  Frequency (%)
2                     4536   42.5%
3                     2871   26.9%
0                     2197   20.6%
4                     697    6.5%
1                     381    3.6%

Most occurring characters

Value                 Count  Frequency (%)
2                     4536   42.5%
3                     2871   26.9%
0                     2197   20.6%
4                     697    6.5%
1                     381    3.6%

Most occurring categories

Value                 Count  Frequency (%)
Decimal Number        10682  100.0%

Most frequent character per category

Value                 Count  Frequency (%)
2                     4536   42.5%
3                     2871   26.9%
0                     2197   20.6%
4                     697    6.5%
1                     381    3.6%

Most occurring scripts

Value                 Count  Frequency (%)
Common                10682  100.0%

Most frequent character per script

Value                 Count  Frequency (%)
2                     4536   42.5%
3                     2871   26.9%
0                     2197   20.6%
4                     697    6.5%
1                     381    3.6%

Most occurring blocks

Value                 Count  Frequency (%)
ASCII                 10682  100.0%

Most frequent character per block

Value                 Count  Frequency (%)
2                     4536   42.5%
3                     2871   26.9%
0                     2197   20.6%
4                     697    6.5%
1                     381    3.6%

Destination
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.436154278
Minimum: 0
Maximum: 5
Zeros: 2871
Zeros (%): 26.9%
Memory size: 83.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.47484469
Coefficient of variation (CV): 1.026940289
Kurtosis: 0.6319566897
Mean: 1.436154278
Median Absolute Deviation (MAD): 1
Skewness: 1.244045862
Sum: 15341
Variance: 2.175166859
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value                 Count  Frequency (%)
1                     4536   42.5%
0                     2871   26.9%
2                     1265   11.8%
5                     932    8.7%
3                     697    6.5%
4                     381    3.6%

Value                 Count  Frequency (%)
0                     2871   26.9%
1                     4536   42.5%
2                     1265   11.8%
3                     697    6.5%
4                     381    3.6%

Value                 Count  Frequency (%)
5                     932    8.7%
4                     381    3.6%
3                     697    6.5%
2                     1265   11.8%
1                     4536   42.5%

Duration
Real number (ℝ≥0)

Distinct: 368
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 643.0205018
Minimum: 5
Maximum: 2860
Zeros: 0
Zeros (%): 0.0%
Memory size: 83.6 KiB

Quantile statistics

Minimum: 5
5-th percentile: 90
Q1: 170
median: 520
Q3: 930
95-th percentile: 1615
Maximum: 2860
Range: 2855
Interquartile range (IQR): 760

Descriptive statistics

Standard deviation: 507.8301335
Coefficient of variation (CV): 0.7897572971
Kurtosis: -0.1663341759
Mean: 643.0205018
Median Absolute Deviation (MAD): 350
Skewness: 0.8614112576
Sum: 6868745
Variance: 257891.4445
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
170                   550    5.1%
90                    386    3.6%
165                   337    3.2%
175                   337    3.2%
155                   329    3.1%
180                   261    2.4%
140                   238    2.2%
150                   220    2.1%
160                   158    1.5%
85                    135    1.3%
Other values (358)    7731   72.4%

Value                 Count  Frequency (%)
5                     1      < 0.1%
75                    24     0.2%
80                    61     0.6%
85                    135    1.3%
90                    386    3.6%

Value                 Count  Frequency (%)
2860                  1      < 0.1%
2820                  1      < 0.1%
2565                  1      < 0.1%
2525                  1      < 0.1%
2480                  1      < 0.1%

Total_Stops
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 83.6 KiB

Value                 Count
0                     5625
4                     3491
1                     1520
2                     45
3                     1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 10682
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 4
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Value                 Count  Frequency (%)
0                     5625   52.7%
4                     3491   32.7%
1                     1520   14.2%
2                     45     0.4%
3                     1      < 0.1%

Histogram of lengths of the category

Value                 Count  Frequency (%)
0                     5625   52.7%
4                     3491   32.7%
1                     1520   14.2%
2                     45     0.4%
3                     1      < 0.1%

Most occurring characters

Value                 Count  Frequency (%)
0                     5625   52.7%
4                     3491   32.7%
1                     1520   14.2%
2                     45     0.4%
3                     1      < 0.1%

Most occurring categories

Value                 Count  Frequency (%)
Decimal Number        10682  100.0%

Most frequent character per category

Value                 Count  Frequency (%)
0                     5625   52.7%
4                     3491   32.7%
1                     1520   14.2%
2                     45     0.4%
3                     1      < 0.1%

Most occurring scripts

Value                 Count  Frequency (%)
Common                10682  100.0%

Most frequent character per script

Value                 Count  Frequency (%)
0                     5625   52.7%
4                     3491   32.7%
1                     1520   14.2%
2                     45     0.4%
3                     1      < 0.1%

Most occurring blocks

Value                 Count  Frequency (%)
ASCII                 10682  100.0%

Most frequent character per block

Value                 Count  Frequency (%)
0                     5625   52.7%
4                     3491   32.7%
1                     1520   14.2%
2                     45     0.4%
3                     1      < 0.1%

Additional_Info
Real number (ℝ≥0)

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.392997566
Minimum: 0
Maximum: 9
Zeros: 19
Zeros (%): 0.2%
Memory size: 83.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 8
median: 8
Q3: 8
95-th percentile: 8
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.214253744
Coefficient of variation (CV): 0.1642437635
Kurtosis: 2.508230984
Mean: 7.392997566
Median Absolute Deviation (MAD): 0
Skewness: -1.779688529
Sum: 78972
Variance: 1.474412154
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value                 Count  Frequency (%)
8                     8344   78.1%
5                     1982   18.6%
7                     320    3.0%
0                     19     0.2%
4                     7      0.1%
3                     4      < 0.1%
6                     3      < 0.1%
2                     1      < 0.1%
9                     1      < 0.1%
1                     1      < 0.1%

Value                 Count  Frequency (%)
0                     19     0.2%
1                     1      < 0.1%
2                     1      < 0.1%
3                     4      < 0.1%
4                     7      0.1%

Value                 Count  Frequency (%)
9                     1      < 0.1%
8                     8344   78.1%
7                     320    3.0%
6                     3      < 0.1%
5                     1982   18.6%

year_of_Journey
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 83.6 KiB

Value                 Count
2019                  10682

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 42728
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2019
3rd row: 2019
4th row: 2019
5th row: 2019

Value                 Count  Frequency (%)
2019                  10682  100.0%

Histogram of lengths of the category

Value                 Count  Frequency (%)
2019                  10682  100.0%

Most occurring characters

Value                 Count  Frequency (%)
2                     10682  25.0%
0                     10682  25.0%
1                     10682  25.0%
9                     10682  25.0%

Most occurring categories

Value                 Count  Frequency (%)
Decimal Number        42728  100.0%

Most frequent character per category

Value                 Count  Frequency (%)
2                     10682  25.0%
0                     10682  25.0%
1                     10682  25.0%
9                     10682  25.0%

Most occurring scripts

Value                 Count  Frequency (%)
Common                42728  100.0%

Most frequent character per script

Value                 Count  Frequency (%)
2                     10682  25.0%
0                     10682  25.0%
1                     10682  25.0%
9                     10682  25.0%

Most occurring blocks

Value                 Count  Frequency (%)
ASCII                 42728  100.0%

Most frequent character per block

Value                 Count  Frequency (%)
2                     10682  25.0%
0                     10682  25.0%
1                     10682  25.0%
9                     10682  25.0%

month_of_Journey
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 83.6 KiB

Value                 Count
5                     3465
6                     3414
3                     2724
4                     1079

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 10682
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 5
3rd row: 6
4th row: 5
5th row: 3

Value                 Count  Frequency (%)
5                     3465   32.4%
6                     3414   32.0%
3                     2724   25.5%
4                     1079   10.1%

Histogram of lengths of the category

Value                 Count  Frequency (%)
5                     3465   32.4%
6                     3414   32.0%
3                     2724   25.5%
4                     1079   10.1%

Most occurring characters

Value                 Count  Frequency (%)
5                     3465   32.4%
6                     3414   32.0%
3                     2724   25.5%
4                     1079   10.1%

Most occurring categories

Value                 Count  Frequency (%)
Decimal Number        10682  100.0%

Most frequent character per category

Value                 Count  Frequency (%)
5                     3465   32.4%
6                     3414   32.0%
3                     2724   25.5%
4                     1079   10.1%

Most occurring scripts

Value                 Count  Frequency (%)
Common                10682  100.0%

Most frequent character per script

Value                 Count  Frequency (%)
5                     3465   32.4%
6                     3414   32.0%
3                     2724   25.5%
4                     1079   10.1%

Most occurring blocks

Value                 Count  Frequency (%)
ASCII                 10682  100.0%

Most frequent character per block

Value                 Count  Frequency (%)
5                     3465   32.4%
6                     3414   32.0%
3                     2724   25.5%
4                     1079   10.1%

week_of_Journey
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 18
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.41368658
Minimum: 9
Maximum: 26
Zeros: 0
Zeros (%): 0.0%
Memory size: 83.6 KiB

Quantile statistics

Minimum: 9
5-th percentile: 10
Q1: 13
median: 20
Q3: 23
95-th percentile: 26
Maximum: 26
Range: 17
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 5.227373156
Coefficient of variation (CV): 0.2838852032
Kurtosis: -1.147488123
Mean: 18.41368658
Median Absolute Deviation (MAD): 4
Skewness: -0.4050218433
Sum: 196695
Variance: 27.32543011
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=18)
Value                 Count  Frequency (%)
23                    1331   12.5%
19                    1024   9.6%
20                    909    8.5%
12                    902    8.4%
24                    821    7.7%
21                    783    7.3%
22                    724    6.8%
26                    706    6.6%
10                    705    6.6%
9                     514    4.8%
Other values (8)      2263   21.2%

Value                 Count  Frequency (%)
9                     514    4.8%
10                    705    6.6%
11                    304    2.8%
12                    902    8.4%
13                    299    2.8%

Value                 Count  Frequency (%)
26                    706    6.6%
25                    214    2.0%
24                    821    7.7%
23                    1331   12.5%
22                    724    6.8%

day_of_Journey
Real number (ℝ≥0)

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.5090807
Minimum: 1
Maximum: 27
Zeros: 0
Zeros (%): 0.0%
Memory size: 83.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 6
median: 12
Q3: 21
95-th percentile: 27
Maximum: 27
Range: 26
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.479363137
Coefficient of variation (CV): 0.6276787686
Kurtosis: -1.272847284
Mean: 13.5090807
Median Absolute Deviation (MAD): 6
Skewness: 0.1181743134
Sum: 144304
Variance: 71.89959921
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value                 Count  Frequency (%)
9                     1406   13.2%
6                     1287   12.0%
27                    1130   10.6%
21                    1111   10.4%
1                     1075   10.1%
24                    1052   9.8%
15                    984    9.2%
12                    957    9.0%
3                     848    7.9%
18                    832    7.8%

Value                 Count  Frequency (%)
1                     1075   10.1%
3                     848    7.9%
6                     1287   12.0%
9                     1406   13.2%
12                    957    9.0%

Value                 Count  Frequency (%)
27                    1130   10.6%
24                    1052   9.8%
21                    1111   10.4%
18                    832    7.8%
15                    984    9.2%

hour_of_Journey
Real number (ℝ≥0)

Distinct: 24
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.49101292
Minimum: 0
Maximum: 23
Zeros: 40
Zeros (%): 0.4%
Memory size: 83.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 8
median: 11
Q3: 18
95-th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 5.748820008
Coefficient of variation (CV): 0.4602364953
Kurtosis: -1.194929286
Mean: 12.49101292
Median Absolute Deviation (MAD): 5
Skewness: 0.1129237509
Sum: 133429
Variance: 33.04893149
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
Value                 Count  Frequency (%)
9                     915    8.6%
7                     867    8.1%
8                     697    6.5%
17                    695    6.5%
6                     687    6.4%
20                    651    6.1%
5                     629    5.9%
11                    580    5.4%
19                    567    5.3%
10                    536    5.0%
Other values (14)     3858   36.1%

Value                 Count  Frequency (%)
0                     40     0.4%
1                     37     0.3%
2                     194    1.8%
3                     24     0.2%
4                     170    1.6%

Value                 Count  Frequency (%)
23                    161    1.5%
22                    387    3.6%
21                    492    4.6%
20                    651    6.1%
19                    567    5.3%

minute_of_Journey
Real number (ℝ≥0)

ZEROS

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.40928665
Minimum: 0
Maximum: 55
Zeros: 2062
Zeros (%): 19.3%
Memory size: 83.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5
median: 25
Q3: 40
95-th percentile: 55
Maximum: 55
Range: 55
Interquartile range (IQR): 35

Descriptive statistics

Standard deviation: 18.76780146
Coefficient of variation (CV): 0.768879555
Kurtosis: -1.292665532
Mean: 24.40928665
Median Absolute Deviation (MAD): 20
Skewness: 0.1672339983
Sum: 260740
Variance: 352.2303716
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value                 Count  Frequency (%)
0                     2062   19.3%
30                    1215   11.4%
55                    1058   9.9%
10                    890    8.3%
45                    875    8.2%
5                     773    7.2%
15                    692    6.5%
25                    691    6.5%
20                    666    6.2%
35                    665    6.2%
Other values (2)      1095   10.3%

Value                 Count  Frequency (%)
0                     2062   19.3%
5                     773    7.2%
10                    890    8.3%
15                    692    6.5%
20                    666    6.2%

Value                 Count  Frequency (%)
55                    1058   9.9%
50                    591    5.5%
45                    875    8.2%
40                    504    4.7%
35                    665    6.2%

Price
Real number (ℝ≥0)

Distinct: 1870
Distinct (%): 17.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9087.214567
Minimum: 1759
Maximum: 79512
Zeros: 0
Zeros (%): 0.0%
Memory size: 83.6 KiB

Quantile statistics

Minimum: 1759
5-th percentile: 3543
Q1: 5277
median: 8372
Q3: 12373
95-th percentile: 15764
Maximum: 79512
Range: 77753
Interquartile range (IQR): 7096

Descriptive statistics

Standard deviation: 4611.54881
Coefficient of variation (CV): 0.5074766064
Kurtosis: 13.30193677
Mean: 9087.214567
Median Absolute Deviation (MAD): 3382
Skewness: 1.812404555
Sum: 97069626
Variance: 21266382.43
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
10262                 258    2.4%
10844                 212    2.0%
7229                  162    1.5%
4804                  160    1.5%
4823                  131    1.2%
14714                 109    1.0%
3943                  104    1.0%
15129                 93     0.9%
3841                  91     0.9%
12898                 86     0.8%
Other values (1860)   9276   86.8%

Value                 Count  Frequency (%)
1759                  4      < 0.1%
1840                  1      < 0.1%
1965                  36     0.3%
2017                  35     0.3%
2050                  10     0.1%

Value                 Count  Frequency (%)
79512                 1      < 0.1%
62427                 1      < 0.1%
57209                 1      < 0.1%
54826                 3      < 0.1%
52285                 1      < 0.1%
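
Several of the statistics reported for Price are derived from others, so they can be cross-checked with simple arithmetic. The sketch below assumes the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared); the dictionary and variable names are illustrative, and the inputs are copied from the Price section of this report.

```python
# Values copied from the Price section of this report.
price = {
    "min": 1759,
    "q1": 5277,
    "q3": 12373,
    "max": 79512,
    "mean": 9087.214567,
    "std": 4611.54881,
}

value_range = price["max"] - price["min"]          # reported Range: 77753
iqr = price["q3"] - price["q1"]                    # reported IQR: 7096
coeff_of_variation = price["std"] / price["mean"]  # reported CV: 0.5074766064
variance = price["std"] ** 2                       # reported Variance: 21266382.43

print(value_range, iqr, round(coeff_of_variation, 6), round(variance, 2))
```

The large gap between the 95-th percentile (15764) and the maximum (79512), together with a kurtosis above 13, indicates a small number of very high prices in the right tail.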

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
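
The calculation described above can be sketched in plain Python. This is an illustration of the formula, not the code the report generator runs; the function name `pearson_r` is hypothetical, and the inputs are the Duration and Price values of the first five sample rows of this report.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Duration and Price of the first five sample rows.
durations = [170, 445, 1140, 325, 285]
prices = [3897, 7662, 13882, 6218, 13302]

r = pearson_r(durations, prices)

# Changing location and scale of one variable leaves r unchanged.
r_rescaled = pearson_r([2 * d + 10 for d in durations], prices)
print(r, r_rescaled)
```

Five rows are far too few to say anything about the full dataset; the example only demonstrates the arithmetic and the invariance under shifting and scaling.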

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
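
A minimal sketch of this procedure: replace each value by its rank, then apply the Pearson formula to the ranks. The helper names are hypothetical, and giving tied values the average of their positions is an assumption of this sketch (it is the common convention).

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship (illustrative values).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

For this cubic relationship the rank correlation is perfect while Pearson's r falls below 1, which is the difference the paragraph above describes.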

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
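
The pair-counting definition above can be sketched directly. This version follows the description literally (often called tau-a) and counts tied pairs as neither concordant nor discordant. Library implementations commonly apply a tie correction (tau-b), so their results can differ on data with many ties, such as the integer-coded columns in this report. The function name is hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither.
    return (concordant - discordant) / len(pairs)


# One swapped pair out of six: 5 concordant, 1 discordant.
tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

The quadratic loop over all pairs is fine for a small illustration, but it would be slow on the 10682 observations profiled here.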

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
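
A minimal sketch of the bias-corrected estimator, assuming the standard form of Bergsma's correction: the chi-squared statistic divided by n is reduced by (k-1)(r-1)/(n-1), and the table dimensions are shrunk in the same way. The function name and the list-of-rows input are choices made for this example, and the sketch assumes every row and column total is nonzero.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Illustrative 2x2 tables, not taken from the profiled dataset.
independent = cramers_v_corrected([[10, 10], [10, 10]])
perfect = cramers_v_corrected([[50, 0], [0, 50]])
print(independent, perfect)
```

The two toy tables sit at the extremes of the scale: identical rows give independence, while a diagonal table gives perfect association.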

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

row    df_index  Airline  Source  Destination  Duration  Total_Stops  Additional_Info  year_of_Journey  month_of_Journey  week_of_Journey  day_of_Journey  hour_of_Journey  minute_of_Journey  Price
0      0         3        0       5            170       4            8                2019             3                 12               24              22               20                 3897
1      1         1        3       0            445       1            8                2019             5                 18               1               5                50                 7662
2      2         4        2       1            1140      1            8                2019             6                 23               9               9                25                 13882
3      3         3        3       0            325       0            8                2019             5                 19               12              18               5                  6218
4      4         3        0       5            285       0            8                2019             3                 9                1               16               50                 13302
5      5         8        3       0            145       4            8                2019             6                 26               24              9                0                  3873
6      6         4        0       5            930       0            5                2019             3                 11               12              18               55                 11087
7      7         4        0       5            1265      0            8                2019             3                 9                1               8                0                  22270
8      8         4        0       5            1530      0            5                2019             3                 11               12              8                55                 11087
9      9         6        2       1            470       0            8                2019             5                 22               27              11               25                 8625

Last rows

row    df_index  Airline  Source  Destination  Duration  Total_Stops  Additional_Info  year_of_Journey  month_of_Journey  week_of_Journey  day_of_Journey  hour_of_Journey  minute_of_Journey  Price
10672  10673     4        2       1            900       1            8                2019             5                 22               27              13               25                 16704
10673  10674     4        0       5            1485      0            5                2019             3                 11               12              20               35                 11087
10674  10675     1        4       3            80        4            8                2019             6                 23               9               6                20                 3100
10675  10676     6        2       1            520       0            8                2019             5                 18               1               10               20                 9794
10676  10677     8        0       2            160       4            7                2019             5                 21               21              5                55                 3257
10677  10678     0        3       0            150       4            8                2019             4                 15               9               19               55                 4107
10678  10679     1        3       0            155       4            8                2019             4                 17               27              20               45                 4145
10679  10680     4        0       2            180       4            8                2019             4                 17               27              8                20                 7229
10680  10681     10       0       5            160       4            8                2019             3                 9                1               11               30                 12648
10681  10682     1        2       1            500       1            8                2019             5                 19               9               10               55                 11753